WikiLeaks Twitter DMs

About

On 29 July 2018, Emma Best published on her website a copy of more than 11,000 WikiLeaks Twitter DMs: https://emma.best/2018/07/29/11000-messages-from-private-wikileaks-chat-released/

This page presents data extracted and wrangled from that corpus, to make it easy to search, extract and share.

How to use this page

  • Every “link.csv” entry is a downloadable CSV file.
  • You can search and sort every table. Search results can be downloaded as CSV or copied to the clipboard.
  • You can zoom in on a time series by selecting a date range on the plot, or by using the range selector beside it. Double-click to reset the view.
  • Under each dynamic plot, you can display a static version by clicking on “Static plot”.

The datasets:

List of all DMs

A dataset with 3 columns:

  • text: extracted text
  • date: date of the DM
  • user: user who sent the DM

wikileaks_dm.csv
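
As an illustration of how this three-column file might be consumed, here is a minimal Python sketch using only the standard library. The page itself was produced in R, so this is not the author's code. The sample rows are placeholders rather than real messages, the ISO-style date strings are an assumption about the file, and `load_dms` and `search` are hypothetical helper names.

```python
import csv
import io

# Hypothetical sample standing in for wikileaks_dm.csv; the rows are
# placeholders, not real messages from the corpus.
SAMPLE_CSV = """text,date,user
"first placeholder message",2015-05-01,user_a
"second placeholder, with a comma",2015-05-01,user_b
"third placeholder message",2016-01-15,user_a
"""


def load_dms(handle):
    """Read DM rows into a list of dicts keyed by text, date and user."""
    reader = csv.DictReader(handle)
    return [
        {"text": row["text"], "date": row["date"], "user": row["user"]}
        for row in reader
    ]


def search(dms, term):
    """Return the DMs whose text contains term, ignoring case."""
    needle = term.lower()
    return [dm for dm in dms if needle in dm["text"].lower()]


dms = load_dms(io.StringIO(SAMPLE_CSV))
matches = search(dms, "COMMA")
print(len(dms), "rows;", len(matches), "match")
```

To read the downloaded file instead of the sample, something like `with open("wikileaks_dm.csv", newline="", encoding="utf-8") as f: dms = load_dms(f)` should work, assuming the file is UTF-8 encoded.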

Count of daily DMs

A dataset with 2 columns:

  • date: the date
  • n: number of DMs

daily.csv

Static plot
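
The daily count above can be rebuilt from the list of all DMs. The following Python sketch shows one way to do it; it assumes each date string starts with an ISO day (`YYYY-MM-DD`), which is an assumption about the file and not something this page states. The rows are placeholders and `daily_counts` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical rows shaped like wikileaks_dm.csv; placeholders, not real DMs.
# Assumption: each date string starts with an ISO day, "YYYY-MM-DD".
dms = [
    {"text": "placeholder one", "date": "2015-05-01 10:00:00", "user": "user_a"},
    {"text": "placeholder two", "date": "2015-05-01 18:30:00", "user": "user_b"},
    {"text": "placeholder three", "date": "2016-01-15 09:12:00", "user": "user_a"},
]


def daily_counts(rows):
    """Return (date, n) pairs sorted by date, one pair per day with DMs."""
    counts = Counter(row["date"][:10] for row in rows)
    return sorted(counts.items())


daily = daily_counts(dms)
for day, n in daily:
    print(day, n)
```

Days without any DM do not appear in the result; whether daily.csv fills such gaps with zeros is not specified here.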

DMs by year

3 datasets (1 per year), each with 3 columns:

  • text: extracted text
  • date: date of the DM
  • user: user who sent the DM

2015

2015.csv

Static plot

2016

2016.csv

Static plot

2017

2017.csv

Static plot
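
The per-year files above are subsets of the full list, so they can be derived from it. This Python sketch groups rows by the first four characters of the date, again assuming ISO-style date strings; the rows are placeholders and `split_by_year` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical rows shaped like wikileaks_dm.csv; placeholders, not real DMs.
# Assumption: each date string starts with a four-digit year.
dms = [
    {"text": "placeholder one", "date": "2015-05-01", "user": "user_a"},
    {"text": "placeholder two", "date": "2015-11-20", "user": "user_b"},
    {"text": "placeholder three", "date": "2016-01-15", "user": "user_a"},
    {"text": "placeholder four", "date": "2017-08-09", "user": "user_c"},
]


def split_by_year(rows):
    """Group rows into a dict mapping a year string to its list of rows."""
    by_year = defaultdict(list)
    for row in rows:
        by_year[row["date"][:4]].append(row)
    return dict(by_year)


yearly = split_by_year(dms)
for year in sorted(yearly):
    print(year, len(yearly[year]))
```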

Count of user participation

A dataset with 2 columns:

  • user: the user
  • n: number of DMs in the corpus

user_count.csv
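
A participation table like this one amounts to a frequency count of the user column. A short Python sketch, with placeholder rows and a hypothetical `user_counts` helper:

```python
from collections import Counter

# Hypothetical rows shaped like wikileaks_dm.csv; placeholders, not real DMs.
dms = [
    {"text": "placeholder one", "date": "2015-05-01", "user": "user_a"},
    {"text": "placeholder two", "date": "2015-05-01", "user": "user_b"},
    {"text": "placeholder three", "date": "2016-01-15", "user": "user_a"},
]


def user_counts(rows):
    """Return (user, n) pairs, most active user first."""
    return Counter(row["user"] for row in rows).most_common()


participation = user_counts(dms)
for user, n in participation:
    print(user, n)
```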

Mentions

DMs that contain a mention of a Twitter account.

A dataset with 4 columns:

  • mention: the mentioned account
  • text: extracted text
  • date: date of the DM
  • user: user who sent the DM

mentions.csv

Count of the mentions:

A dataset with 2 columns:

  • mention: the mention
  • n: number of DMs

mentions_count.csv
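
The two mention tables above could be built along the following lines. This Python sketch is only an approximation: the exact matching rule used for this page is not stated, so the pattern below (an `@` followed by 1 to 15 ASCII letters, digits or underscores, not preceded by a word character) is an assumption, as are the placeholder rows and the helper names `extract_mentions` and `mention_counts`.

```python
import re
from collections import Counter

# Assumption: a handle is "@" plus 1 to 15 ASCII letters, digits or
# underscores. The lookbehind skips e-mail addresses such as a@b.org.
MENTION_RE = re.compile(r"(?<![A-Za-z0-9_])@([A-Za-z0-9_]{1,15})(?![A-Za-z0-9_])")

# Hypothetical rows; placeholders, not real DMs.
dms = [
    {"text": "ask @example_one and @example_two", "date": "2016-03-02", "user": "user_a"},
    {"text": "mail me at someone@example.org", "date": "2016-03-03", "user": "user_b"},
    {"text": "@example_one again, cc @example_one", "date": "2016-03-04", "user": "user_b"},
]


def extract_mentions(rows):
    """Return one row per distinct (mention, DM) pair, with 4 columns."""
    out = []
    for row in rows:
        seen = set()
        for handle in MENTION_RE.findall(row["text"]):
            key = handle.lower()
            if key in seen:
                continue  # the same account repeated inside one DM
            seen.add(key)
            out.append({"mention": "@" + handle, **row})
    return out


def mention_counts(mention_rows):
    """Count how many DMs mention each account, most mentioned first."""
    return Counter(m["mention"].lower() for m in mention_rows).most_common()


mentions = extract_mentions(dms)
counts = mention_counts(mentions)
print(counts)
```

Counting each account once per DM matches the description of n as a number of DMs; a count of raw occurrences would drop the `seen` check.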

Urls

Extracted links (strings starting with http).

A dataset with 4 columns:

  • url: the URL found
  • text: extracted text
  • date: date of the DM
  • user: user who sent the DM

urls.csv
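
A link table of this shape could be approximated as follows. This Python sketch treats a URL as a run of non-space characters starting with `http://` or `https://` and strips trailing punctuation; both choices are assumptions, since the page only says that links start with http. The rows are placeholders and `extract_urls` is a hypothetical helper name.

```python
import re

# Assumption: a URL is a run of non-space characters that starts with
# http:// or https://.
URL_RE = re.compile(r"https?://\S+")

# Punctuation that often trails a URL in running text (a design choice).
TRAILING = ".,;:!?)\"'"

# Hypothetical rows; placeholders, not real DMs.
dms = [
    {"text": "see https://example.org/page, then http://example.com.",
     "date": "2016-03-02", "user": "user_a"},
    {"text": "no link in this one", "date": "2016-03-03", "user": "user_b"},
]


def extract_urls(rows):
    """Return one row per URL found, with url, text, date and user."""
    out = []
    for row in rows:
        for raw in URL_RE.findall(row["text"]):
            out.append({"url": raw.rstrip(TRAILING), **row})
    return out


urls = extract_urls(dms)
for item in urls:
    print(item["url"], item["user"])
```

Stripping trailing punctuation can occasionally truncate a URL that legitimately ends with one of those characters, so it is worth spot-checking the results.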

Methodology

Everything was done in R.

Methodology is described in methodo